
499 research outputs found

Memory-Efficient Topic Modeling

Author: Cao Xiao-Qin
Liu Zhi-Qiang
Zeng Jia
Publication date: 08/06/2012

As one of the simplest probabilistic topic modeling techniques, latent Dirichlet allocation (LDA) has found many important applications in text mining, computer vision and computational biology. Recent training algorithms for LDA can be interpreted within a unified message passing framework. However, message passing requires storing previous messages, which takes a large amount of memory that increases linearly with the number of documents or the number of topics. High memory usage is therefore often a major problem for topic modeling of massive corpora containing a large number of topics. To reduce the space complexity, we propose a novel algorithm for training LDA that does not store previous messages: tiny belief propagation (TBP). The basic idea of TBP relates the message passing algorithms to the non-negative matrix factorization (NMF) algorithms, which absorb the message updating into the message passing process and thus avoid storing previous messages. Experimental results on four large data sets confirm that TBP performs comparably to, or even better than, current state-of-the-art training algorithms for LDA, but with much less memory consumption. TBP can do topic modeling when massive corpora cannot fit in the computer memory, for example, extracting thematic topics from 7 GB PUBMED corpora on a common desktop computer with 2 GB of memory.

Comment: 20 pages, 7 figures

arXiv.org e-Print Archive

CiteSeerX

A New Approach to Speeding Up Topic Modeling

Author: Jia Zeng
Xiao-qin Cao
Zhi-qiang Liu
Publication date: 07/04/2014

Latent Dirichlet allocation (LDA) is a widely used probabilistic topic modeling paradigm that has recently found many applications in computer vision and computational biology. In this paper, we propose a fast and accurate batch algorithm, active belief propagation (ABP), for training LDA. Batch LDA algorithms usually require repeated scanning of the entire corpus and searching of the complete topic space. When processing massive corpora with a large number of topics, each training iteration of a batch LDA algorithm is therefore often inefficient and time-consuming. To accelerate training, ABP actively scans only a subset of the corpus and searches only a subset of the topic space, and therefore saves enormous training time in each iteration. To ensure accuracy, ABP selects only those documents and topics that contribute the largest residuals within the residual belief propagation (RBP) framework. On four real-world corpora, ABP performs around 10 to 100 times faster than state-of-the-art batch LDA algorithms with comparable topic modeling accuracy.

Comment: 14 pages, 12 figures

arXiv.org e-Print Archive

CiteSeerX

Cooperative Hunting by Multiple Mobile Robots Based on Local Interaction

Author: Min Tan
Nong Gu
Saeid Nahavandi
Zhi-Qiang Cao
Publication venue: IntechOpen
Publication date: 01/01/2005

IntechOpen

Deakin Research Online

Crossref

Deep Learning the Effects of Photon Sensors on the Event Reconstruction Performance in an Antineutrino Detector

Author: Cao De-Wen
Liu You-Hang
Loh Chang-Wei
Qi Ming
Qian Zhi-Qiang
Wang Wei
Yang Hai-Bo
Zhang Rui
Publication date: 01/01/2018

We provide a fast approach that uses deep learning to evaluate the effects of photon sensors in an antineutrino detector on its event reconstruction performance. This work is an attempt to harness the power of deep learning for detector design and upgrade planning. Using the Daya Bay detector as a benchmark case and the vertex reconstruction performance as the objective for the deep neural network, we find that the photomultiplier tubes (PMTs) differ in their relative importance to the vertex reconstruction. More importantly, the vertex position resolutions for the Daya Bay detector follow approximately a multi-exponential relationship with respect to the number of PMTs and, hence, the coverage. This could also assist in deciding on the merits of installing additional PMTs in future detector plans. The approach could easily be used with other objectives in place of vertex reconstruction.

arXiv.org e-Print Archive

Directory of Open Access Journals

catena-Poly[[[[N′-(4-cyanobenzylidene)nicotinohydrazide]silver(I)]-μ-[N′-(4-cyanobenzylidene)nicotinohydrazide]] hexafluoridoarsenate]

Author: Fen Zhi-Qiang
He Yong
Kou Chun-Hong
Ning Ai-Min
Niu Cao-Yuan
Publication venue: International Union of Crystallography
Publication date: 01/08/2009

In the title compound, {[Ag(C14H10N4O)2]AsF6}n, the AgI ion is coordinated by two N atoms from two different pyridyl rings and one N atom from one carbonitrile group of three different N′-(4-cyanobenzylidene)nicotinohydrazide ligands in a distorted T-shaped geometry. The Ag—Ncarbonitrile bond distance is significantly longer than the Ag—Npyridyl distances. The bond angles around the AgI atom also deviate from those of an ideal T-shaped geometry. One type of ligand acts as a bridge that connects AgI atoms into chains along [01]. These chains are linked to each other via N—H⋯O hydrogen bonds and Ag⋯O interactions with an Ag⋯O separation of 2.869 (2) Å. In addition, the [AsF6]− counter-anions are linked to the hydrazone groups through N—H⋯F hydrogen bonds. Four of the F atoms of the [AsF6]− anion are disordered over two sets of sites with occupancies of 0.732 (9) and 0.268 (9).

Directory of Open Access Journals

PubMed Central